(57) Abstract

The present invention provides novel, artificial restriction endonucleases comprising the DNA binding motif of the transcription
factor Sp1 and the C-terminal DNA cleavage domain of FokI restriction endonuclease. The new restriction endonucleases are designated
herein as "Splase enzyme" and "HSplase enzyme". The restriction endonuclease of the present invention recognizes a 10 base nucleotide
sequence and thus, since the probability of these particular 10 base sequences occurring frequently in a DNA sample is low, the restriction
endonucleases only rarely cleave the DNA sample. Thus, the Splase enzyme and HSplase enzyme are particularly useful for cutting DNA
into large fragments rather than into a myriad of fragments as is common with conventional restriction endonucleases. The invention also
relates to artificial fusion genes which encode the artificial restriction endonucleases, including the artificial restriction endonucleases Splase
enzyme and HSplase enzyme. The DNA molecules, that is, the genes that encode the Splase enzyme and HSplase enzyme, are designated
herein as the "Splase gene" and the "HSplase gene", respectively. The invention also relates to vectors and cells that contain the artificial
genes.

ARTIFICIAL RESTRICTION ENDONUCLEASES

BACKGROUND OF THE INVENTION 

Restriction endonucleases are enzymes that cleave double
stranded DNA at specific points. Restriction endonucleases have been
isolated from a variety of organisms and employed as a valuable tool in
recombinant DNA technology. Each restriction endonuclease recognizes
a certain base sequence and only that sequence. Each restriction
endonuclease is a tool permitting flexibility in the manipulation and
assembly of DNA in vitro; new restriction endonucleases are desirable
since they increase the number of techniques available for the in vitro
manipulation of DNA.

SUMMARY OF THE INVENTION

The present invention provides novel, artificial restriction
endonucleases comprising the DNA binding motif of the transcription
factor Sp1 and the C-terminal DNA cleavage domain of FokI restriction
endonuclease. The new restriction endonucleases are designated herein
as "Splase enzyme" and "HSplase enzyme". The restriction endonuclease
of the present invention recognizes a 10 base nucleotide sequence and
thus, since the probability of these particular 10 base sequences
occurring frequently in a DNA sample is low, the restriction
endonucleases only rarely cleave the DNA sample. Thus, the Splase
enzyme and HSplase enzyme are particularly useful for cutting DNA into
large fragments rather than into a myriad of fragments as is common
with conventional restriction endonucleases. The invention also
relates to artificial fusion genes which encode the artificial
restriction endonucleases, including the artificial restriction
endonucleases Splase enzyme and HSplase enzyme. The DNA molecules,
that is, the genes that encode the Splase enzyme and HSplase enzyme, are
designated herein as the "Splase gene" and the "HSplase gene",
respectively. The invention also relates to vectors and cells that
contain the artificial genes.
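
The rarity argument for a 10 base recognition sequence can be illustrated with a rough estimate. The sketch below assumes a random sequence with equal base frequencies and counts one strand only; the function name `expected_sites` and the use of the 48,502 bp phage lambda genome as the example are illustrative assumptions, not part of the specification.

```python
def expected_sites(dna_length_bp: int, site_length_bp: int, n_variants: int = 1) -> float:
    """Expected number of occurrences of a recognition site in random DNA.

    Assumes each base is equally likely (probability 1/4) and independent.
    n_variants is the number of distinct sequences the enzyme accepts.
    """
    windows = max(dna_length_bp - site_length_bp + 1, 0)  # possible start positions
    return windows * n_variants / 4 ** site_length_bp


LAMBDA_BP = 48_502  # length of the phage lambda genome, used only as an example

six_cutter = expected_sites(LAMBDA_BP, 6)    # conventional 6 base recognition site
ten_cutter = expected_sites(LAMBDA_BP, 10)   # a single 10 base recognition site

print(f"6 base site:  about {six_cutter:.1f} expected sites")
print(f"10 base site: about {ten_cutter:.3f} expected sites")
```

Under these assumptions a single 10 base sequence is expected about once per 4^10 (roughly one million) base pairs, which is why such an enzyme yields large fragments.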

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a mobility-shift gel in which: DNA substrate Seq. ID
No. 16 incubated with Splase was applied to lane 1. The DNA substrate
Seq. ID No. 16 was applied to lane 2. The DNA substrate Seq. ID No. 16
incubated with Splase enzyme in the presence of the competitor DNA,
which lacked an Sp1 site, was applied to lane 3. The DNA substrate
Seq. ID No. 16 and competitor DNA were applied to lane 4. The
competitor DNA incubated with Splase enzyme was applied to lane 5. The
competitor DNA was applied to lane 6. The gel, which was subjected to
electrophoresis in TAE buffer without EDTA, demonstrates the ability of
the HSplase enzyme to specifically bind to the Sp1 site;

Figure 2A is a gel containing the following: the pUC19 DNA
digested by HSplase was applied to lane 1; pUC19 DNA without HSplase
was applied to lane 2; DNA molecular weight marker VII from Boehringer
Mannheim was applied to lane 3; the pUC3Sp1 digested by HSplase was
applied to lane 4; the pUCSSPl DNA was applied to lane 5; the pUCSSPl
incubated with HSplase was applied to lane 6; pUC3Sp1 was applied to
lane 7; and pUC5Sp1 digested by HSplase was applied to lane 8. The gel
was then subjected to electrophoresis; the results demonstrate that HSplase
cleaved the closed circular DNA with 5 Sp1 sites into linear form;

Figure 2B is a gel in which: pUC19 substrate after double
digestion by HSplase and AlwNI was applied to lane 1; pUC5Sp1 substrate
after double digestion was applied to lane 2; and the DNA molecular
weight marker VII from Boehringer Mannheim was applied to lane 3. The
gel was then subjected to electrophoresis; the gel confirms that
cleavage of the closed circular DNA is specific and near Sp1 sites; and

Figure 3 is a gel in which: molecular weight markers were
applied to lanes 1 and 8; AlwNI linearized pUC19 was applied to lane 2;
AlwNI linearized pUC19 incubated with HSplase was applied to lane 3;
AlwNI linearized pUCSSpl was applied to lane 4; AlwNI linearized
pUCSSpl incubated with HSplase was applied to lane 5; BamHI linearized
pUC-BENN-CAT was applied to lane 6; BamHI linearized pUC-BENN-CAT
incubated with HSplase was applied to lane 7. The gels were then
subjected to electrophoresis. The gels demonstrate that HSplase cleaved
linear DNA specifically near Sp1 sites.

DETAILED DESCRIPTION OF THE INVENTION
The present invention provides artificial chimeric proteins,
specifically enzymes, designated "Splase" enzyme and "HSplase" enzyme,
each of said enzymes comprising the DNA binding motif of the
transcription factor Sp1 and the C-terminal DNA cleavage domain of FokI
restriction endonuclease. The invention also relates to fusion genes
encoding the chimeric enzymes Splase and HSplase.

The gene encoding the Splase enzyme is shown in SEQ ID NO 1 and the
amino acid sequence of the Splase enzyme is shown in SEQ ID NO 2. The
gene encoding the HSplase enzyme is shown in SEQ ID NO 3 and the amino
acid sequence of the HSplase enzyme is shown in SEQ ID NO 4.

The Splase Enzyme

The Splase enzyme comprises the amino acid sequence as shown in
Sequence ID No. 2. The HSplase enzyme comprises the amino acid
sequence of Splase, and also comprises histidine residues, preferably
6 histidines at the N-terminus. The HSplase enzyme comprises the amino
acid sequence as shown in Sequence ID No. 4. The first two residues at
the N-terminus of Splase enzyme are methionine and valine. The next
sequence of 92 amino acids at the N-terminus includes the 92 amino acid
zinc finger motif of Sp1 except that the residue at position 92 of the
motif is valine rather than glycine. This change was made for the
convenience of gene fusion. The 203 amino acid sequence at the
C-terminus of the Splase enzyme is preferably identical to the 203 amino
acids of the C-terminal cleavage domain of FokI.

The zinc-finger domain of Sp1, a 92-amino acid peptide sequence
encoded by the zinc-finger motif of the transcription factor Sp1, is a
sequence-specific DNA-binding domain. This domain recognizes several
closely related 10 base pair Sp1 DNA binding sites; recognized Sp1
binding sites include:

Sequence ID No. 5

5'-G(T)GG GCG GG(A) G(A)C(T)-3'.

The HSplase enzyme further recognizes and binds to the following
Sp1 sites:

5'-GGGGCGGGGC-3' Sequence ID No. 6

5'-GAAGCGTGGC-3' Sequence ID No. 7

5'-TGGGCGGGAC-3' Sequence ID No. 8

5'-GGGGAGTGGC-3' Sequence ID No. 9

Thus an example of the Splase sites includes:

Sequence ID No. 10
5'-G(T)G(A)G(A) GC(A)G G(T)G(A) G(A)C(T)-3'.

The Splase enzyme cleaves linear DNA and circular DNA. The
Splase enzyme cleaves near the Sp1 site.
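
The degenerate notation of Sequence ID Nos. 5 and 10 can be checked against the four listed sites with a short script. This is a minimal sketch that assumes a base written in parentheses is an alternative to the base immediately before it, so that `G(T)` means G or T at that position; the helper names `consensus_to_regex` and `matches` are illustrative.

```python
import re

def consensus_to_regex(consensus: str) -> str:
    """Convert 'G(T)GG ...' notation into a regular expression such as '[GT]GG...'."""
    compact = consensus.replace(" ", "").upper()
    # A base followed by a parenthesized base becomes a two-letter character class.
    return re.sub(r"([ACGT])\(([ACGT])\)", r"[\1\2]", compact)

SEQ5 = "G(T)GG GCG GG(A) G(A)C(T)"               # Sequence ID No. 5
SEQ10 = "G(T)G(A)G(A) GC(A)G G(T)G(A) G(A)C(T)"  # Sequence ID No. 10
SITES = {6: "GGGGCGGGGC", 7: "GAAGCGTGGC", 8: "TGGGCGGGAC", 9: "GGGGAGTGGC"}

def matches(consensus: str, site: str) -> bool:
    """True if the whole 10 base site fits the degenerate consensus."""
    return re.fullmatch(consensus_to_regex(consensus), site) is not None

for seq_id, site in SITES.items():
    print(f"Seq ID {seq_id} {site}: "
          f"fits No. 5 = {matches(SEQ5, site)}, fits No. 10 = {matches(SEQ10, site)}")
```

Read this way, Sequence ID No. 10 is the broader pattern: it covers all four listed sites, while Sequence ID No. 5 covers only those that keep the central GGGCGG core.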
Gene Construction

The Splase gene was constructed and is shown in Seq ID NO 1 and
the HSplase gene was constructed and is shown in Seq ID NO 3.
Polymerase chain reaction techniques were used to amplify and subclone
the DNA fragment encoding the Sp1 zinc finger domain. The template
used was plasmid pSp1-516C. The pSp1-516C contains DNA sequence
encoding the C-terminal 516 amino acids of Sp1. The oligonucleotide
primer set used is shown below:

5' primer:

5'-g tcc atg gct aaa aag aaa cag cat att tgc cac-3' Seq. ID. No. 11

3' primer:

3'-c tgg gtg gtc tta ttc ttc cat ggc c-5' Seq. ID. No. 12
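
As a quick consistency check, the two primers can be scanned for the NcoI and KpnI sites used in the cloning steps that follow. The sketch assumes the scanned primer letters read as a, c, g and t only (an assumption about the original print), and it reverses the 3' primer because it is printed in the 3' to 5' direction; the helper names are illustrative.

```python
NCOI = "CCATGG"   # NcoI recognition sequence
KPNI = "GGTACC"   # KpnI recognition sequence

def as_5_to_3(written_3_to_5: str) -> str:
    """Reverse a primer that is printed 3'->5' so it reads 5'->3'."""
    return written_3_to_5.replace(" ", "").upper()[::-1]

def site_positions(seq: str, site: str) -> list[int]:
    """0-based start positions of every occurrence of site in seq."""
    return [i for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site]

# Seq. ID No. 11, printed 5'->3' (transcription of the scanned text, assumed)
fwd = "g tcc atg gct aaa aag aaa cag cat att tgc cac".replace(" ", "").upper()
# Seq. ID No. 12, printed 3'->5' (transcription of the scanned text, assumed)
rev = as_5_to_3("c tgg gtg gtc tta ttc ttc cat ggc c")

print("5' primer NcoI sites:", site_positions(fwd, NCOI))
print("3' primer KpnI sites:", site_positions(rev, KPNI))
```

With this reading, the 5' primer carries an NcoI site around the ATG start codon and the 3' primer carries a KpnI site, which agrees with the NcoI/KpnI digestion of the PCR product.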

First, 100 ng of the plasmid pSp1-516C was amplified in a 100 µl
reaction volume, which contained: 10 mM Tris-HCl at pH 8.8; 10 mM KCl;
1 mM MgCl2; 2 of each of the primers represented above by Sequence
ID Nos 11 and 12; 10 µM each of dATP, dTTP, dGTP and dCTP; 0.003% Tween
20 v/v; and 3 units of UlTma DNA polymerase from Perkin-Elmer. The
reaction was "hot started" using Ampliwax beads from Perkin-Elmer. The
PCR was conducted in a Perkin-Elmer DNA Thermal Cycler employing the
following cycling parameters: an initial denaturation of 2 minutes at
95°C was followed by 30 cycles of amplification for 2 minutes at 95°C,
2 minutes at 60°C, 3 minutes at 72°C.
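
The cycling parameters can be written down as data to estimate the programmed run time. This is a minimal sketch; the `Step` class and `total_minutes` function are illustrative names, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    minutes: int   # hold time in minutes

INITIAL_DENATURATION = Step(95, 2)
CYCLE = (Step(95, 2), Step(60, 2), Step(72, 3))  # denature, anneal, extend
N_CYCLES = 30

def total_minutes(initial: Step, cycle: tuple, n_cycles: int) -> int:
    """Programmed hold time of the whole run, excluding temperature ramps."""
    return initial.minutes + n_cycles * sum(step.minutes for step in cycle)

run_time = total_minutes(INITIAL_DENATURATION, CYCLE, N_CYCLES)
print(f"Programmed hold time: {run_time} min (about {run_time / 60:.1f} h)")
```

The same structure can describe the later FokI amplification by changing only the annealing step to 55°C.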

The reaction was run in duplicate and the PCR products were
combined and extracted with phenol/chloroform (1:1), and the DNA was
precipitated with ethanol and dissolved in 20 µl of TE, which contains
10 mM Tris-HCl, pH 8.0, and 1 mM EDTA. The PCR product is a DNA
fragment coding for the zinc finger domain of Sp1. To prepare the PCR
product for ligation, 5 µl of the DNA was digested with NcoI/KpnI from
Boehringer Mannheim, Germany, according to the manufacturer's
instructions. The digested DNA was separated by electrophoresis on a
0.8% agarose gel in TAE, which contained 0.4 M Tris-HCl, 0.013 M sodium
acetate, and 0.2 mM EDTA, at pH 6.0. The DNA band of approximately 320
base pairs was cut out of the gel, purified using glassmilk as
suggested by the manufacturer, BIO 101, Inc., Vista, CA, and dissolved
in 10 µl of ddH2O. 2 µl of DNA was then ligated into 0.1 µg of
NcoI/KpnI cleaved pTrc99A DNA at 15°C for 4 hours. The ligation
mixture was transformed into JM105 competent cells and positive clones
were identified. The resulting plasmid was designated "pTrcSp1".
A detailed protocol is described in the following:

To clone the gene fragment coding for the Sp1 zinc finger domain
amplified by PCR and subsequently fuse it with the PCR fragment coding
for the FokI nuclease domain, E. coli JM105 was used. The cells were made
competent by the protocol described below and transformed with the DNA
ligation mixture. Plasmid DNAs were then isolated from individual
colonies and positive colonies containing vector with gene fragment
insert were identified by restriction mapping.

Preparation of JM105 Competent Cells

JM105 was streaked onto the surface of an M9 agar plate and
incubated 16 hours at 37°C. A single colony from the plate was
inoculated into 4 ml of LB medium (10 g yeast extract, 5 g tryptone
and 10 g NaCl per liter). The culture was incubated 8 hours at 37°C with
shaking. 1 ml of this culture was used to inoculate 200 ml SOB media
(20 g tryptone, 5 g yeast extract and 0.5 g NaCl) supplemented with 2 ml
each of 1 M MgSO4 and 1 M MgCl2. The culture was then incubated at 37°C
with shaking for 2-3 hours until absorbance at 600 nm reached 0.3. Then
a 100 ml aliquot of the culture was cooled on ice for 10 min. Cells were
then centrifuged at 6,000 x g for 5 minutes. The supernatant was
discarded and cells were resuspended in 8 ml of ice-cold TFBI (0.59 g
KOAc, 1.49 g KCl, 0.29 g CaCl2·2H2O, 1.98 g MnCl2·4H2O, and 30 g glycerol
per 200 ml, pH 5.8) and left on ice for 10 min. The cells were
centrifuged again at 6,000 x g for 5 minutes. The supernatant was
discarded and cells were resuspended in 8 ml of ice-cold TFBII (0.42 g
MOPS, 0.15 g KCl, 2.21 g CaCl2·2H2O and 30 g glycerol per 200 ml, pH
5.8). The cell suspension (now competent) was then frozen in 200 µl
aliquots in liquid nitrogen and stored at -70°C.
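
The TFBI and TFBII recipes are given as grams per 200 ml; converting them to molar concentrations makes them easier to compare with other protocols. The sketch below assumes the salts are the dihydrate of calcium chloride and the tetrahydrate of manganese chloride (the hydrate labels are hard to read in the scanned text) and uses standard formula weights; all names are illustrative.

```python
# Formula weights in g/mol (standard values; hydrate forms are an assumption)
FORMULA_WEIGHT = {
    "KOAc": 98.14,
    "KCl": 74.55,
    "CaCl2.2H2O": 147.01,
    "MnCl2.4H2O": 197.91,
    "MOPS": 209.26,
}

# Grams per 200 ml as stated in the protocol
TFB1_GRAMS = {"KOAc": 0.59, "KCl": 1.49, "CaCl2.2H2O": 0.29, "MnCl2.4H2O": 1.98}
TFB2_GRAMS = {"MOPS": 0.42, "KCl": 0.15, "CaCl2.2H2O": 2.21}
VOLUME_L = 0.2

def millimolar(grams: float, formula_weight: float, volume_l: float) -> float:
    """Concentration in mM for a given mass dissolved in a given volume."""
    return grams / formula_weight / volume_l * 1000.0

def recipe_in_mm(recipe: dict) -> dict:
    """Convert a grams-per-volume recipe into rounded mM values."""
    return {name: round(millimolar(g, FORMULA_WEIGHT[name], VOLUME_L))
            for name, g in recipe.items()}

tfb1_mm = recipe_in_mm(TFB1_GRAMS)
tfb2_mm = recipe_in_mm(TFB2_GRAMS)
glycerol_percent_w_v = 30 / 200 * 100  # 30 g glycerol per 200 ml

print("TFBI  (mM):", tfb1_mm)
print("TFBII (mM):", tfb2_mm)
print("Glycerol (% w/v):", glycerol_percent_w_v)
```

Under these assumptions the amounts correspond to round-number concentrations, which supports the reading of the hydrate forms.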
Transformation of JM105 Competent Cells

200 µl of thawed JM105 cells were incubated with 5 µl of DNA
ligation mixture on ice for 45 minutes. The cells were then heat
shocked at 42°C for 90 seconds and placed on ice for 5 minutes. 800
µl of LB media was added to the transformation mixture and incubated
at 37°C for one hour. 200 µl of the transformation mixture was then
spread onto an LB agar plate containing 100 µg/ml of ampicillin. The
plate was incubated at 37°C overnight and colonies were then visible.

To screen for positive colonies, the following miniprep of
plasmid DNA was conducted. Individual colonies were inoculated into
4 ml LB media containing 100 µg/ml ampicillin and cultured at 37°C with
shaking for 10 hours. 1.5 ml of the culture was centrifuged in a
microcentrifuge for 1 minute and the supernatant was decanted, leaving
100 µl with the pellet. The pellet was then resuspended in the
remaining supernatant by vortexing for 5 seconds. 300 µl of TENS buffer
(10 mM Tris-HCl, pH 8.0, 0.1 M NaOH, 1 mM EDTA and 0.5% SDS) was added,
the tube was vortexed for 5 seconds, 150 µl of 3 M NaOAc, pH 5.2 was
then added and the tube was vortexed again. The sample was then
centrifuged in a microcentrifuge for 2 minutes and the supernatant was
transferred to a fresh tube and mixed with two volumes of cold
ethanol. DNA was brought down by centrifuging in a microcentrifuge for 2
minutes. After washing twice with 70% ethanol, DNA was dried in a
SpeedVac and dissolved in 50 µl TE. The solution was spun again and
supernatant containing DNA was transferred to a new tube and stored at
4°C.

Screening Positive Colonies by Restriction Mapping

To identify clones with gene insert, plasmid DNA from miniprep
was digested with the same restriction enzyme set from Boehringer
Mannheim Company, Germany, which was used for cloning (in this case,
NcoI and KpnI) according to the manufacturer's instructions.
The mixture was then subjected to electrophoresis on a 0.7% agarose
gel, stained with ethidium bromide, and the DNA fragments were
visualized under UV light. Positive clones with gene insert give a
restriction fragment of the right size for the gene insert. In this
case, the NcoI/KpnI fragment coding for the Sp1 zinc finger domain was
320 base pairs.

The DNA fragment encoding the C-terminal DNA cleavage domain of
FokI was also amplified and cloned employing polymerase chain reaction
techniques. The template was genomic DNA isolated from Flavobacterium
okeanokoites and the primer set used is shown below:
5' primer:

5'-cgg gta cct aat cgt ggt gtg act aag-3' Seq. ID. No. 13
3' primer:

3'-tta ttg ccg ctc tat ttg aaa att cct agg cg-5' Seq. ID. No. 14

The PCR conditions were similar to those used for amplification
of the Sp1 coding sequence, except that 300 ng of the genomic DNA was
used as the template and the annealing temperature was 55°C. The
genomic DNA was isolated using the miniprep protocol described as
follows. Flavobacterium okeanokoites was grown at 37°C for 48 hours
with shaking in 3 ml medium containing per liter 10 g of tryptone, 5 g
of yeast extract, 2 g of NaCl and 4.4 g of K2HPO4. The cells were
harvested by centrifugation in a microcentrifuge. The cell pellet was
resuspended in 284 µl TE buffer which contained 10 mM Tris-HCl and 1 mM
EDTA, pH 8.0. Next, 15 µl of 10% SDS and 3 µl of protease K (20 mg/ml)
were added and the sample was incubated at 37°C for one hour. The
sample was thoroughly mixed first with 50 µl of 5 M NaCl and then with
40 µl of CTAB/NaCl solution which contained 10%
hexadecyltrimethylammonium bromide in 0.7 M NaCl. The sample was
incubated at 65°C for ten minutes. The sample was subsequently extracted
with 0.4 ml of chloroform and spun for five minutes in a microcentrifuge.
The aqueous phase was transferred to a new tube and extracted with an
equal volume of phenol/chloroform (50% v/v). The aqueous phase was
transferred to a fresh tube and DNA was precipitated with 2.5 volumes of
ethanol. The sample was washed twice with 70% ethanol and dried in a
SpeedVac. The sample DNA was dissolved in 20 µl double distilled water
and 1 µl TE was added. The DNA concentration was 0.3 mg/ml as determined
by UV absorbance at 260 nm.
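
The concentration estimate from absorbance at 260 nm follows the usual rule of thumb that one A260 unit of double-stranded DNA corresponds to about 50 µg/ml in a 1 cm path. The sketch below is illustrative only: the absorbance reading of 0.30 and the 20-fold dilution are hypothetical values chosen to reproduce the reported 0.3 mg/ml, and the function names are assumptions.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # rule-of-thumb factor for double-stranded DNA, 1 cm path

def dsdna_mg_per_ml(a260: float, dilution_factor: float = 1.0, path_cm: float = 1.0) -> float:
    """Estimate the dsDNA concentration of the undiluted sample in mg/ml."""
    return a260 / path_cm * DSDNA_UG_PER_ML_PER_A260 * dilution_factor / 1000.0

def volume_ul_for_mass(mass_ng: float, conc_mg_per_ml: float) -> float:
    """Volume in µl that contains mass_ng of DNA (1 mg/ml equals 1000 ng/µl)."""
    return mass_ng / (conc_mg_per_ml * 1000.0)

# Hypothetical reading: A260 = 0.30 measured on a 1:20 dilution
conc = dsdna_mg_per_ml(0.30, dilution_factor=20)
template_volume = volume_ul_for_mass(300, conc)  # 300 ng template used in the PCR

print(f"Estimated concentration: {conc:.2f} mg/ml")
print(f"Volume holding 300 ng:   {template_volume:.1f} µl")
```

At 0.3 mg/ml, the 300 ng of genomic template used for the PCR corresponds to about 1 µl of the preparation.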

Next, the PCR-generated FokI cleavage domain gene fragment was
digested with KpnI/BamHI and gel purified. The 625 base pair fragment
was excised and purified, and the fragment was then ligated into the
KpnI/BamHI-cleaved vector pTrcSp1. The resulting plasmid was
designated "pTrcSplase". The pTrcSplase fuses, in frame, the 92-amino
acid DNA binding domain of Sp1 to the C-terminal 203 amino acid FokI
cleavage domain. The entire Splase gene was subsequently cut from
pTrcSplase with NcoI/BamHI and then subcloned into an NcoI/BamHI
cleaved pTO-N vector. The pTO-N vector and its construction are
described in "A Novel Expression Vector for High-Level Synthesis
and Secretion of Foreign Proteins in E. coli: Overproduction of Bovine
Pancreatic Phospholipase A2", T. Deng, et al., Gene 93, 229-234 (1990).
The resulting vector, which is designated "pTO-Splase", directs the
synthesis of Splase enzyme with the OmpA signal peptide fused to its
N-terminus. The OmpA signal peptide directs the synthesized Splase
enzyme into the periplasmic space, thus overcoming the potential toxicity
of Splase enzyme to E. coli.



Addition of Codons for a Histidine Tag to the Splase Gene

To facilitate the purification of the recombinant Splase enzyme
overexpressed in E. coli, codons for six consecutive histidines were
added to the Splase gene to produce the gene designated as the "HSplase
gene". The 6-histidine tag at the N-terminus facilitates the
purification of the HSplase enzyme by metal-chelating chromatography
with Novagen's His-bind resin.

The Splase gene was PCR-amplified using the pTrcSplase plasmid
as a template. The 3' primer that was employed is the same primer as
shown in Seq. ID No. 14. The 5' HSplase primer is shown below:

Seq. ID. No. 15

5'-g tcc atg gct cat cac cat cac cat cac aaa aag aaa cag cat att
tgc cac-3'

The PCR conditions were the same as those used to amplify the DNA
fragment encoding the Sp1 zinc finger domain. The PCR-generated HSplase
gene fragment was then digested with NcoI/KpnI and the 320 base pair
DNA fragment was gel purified. This fragment was subsequently ligated
to an NcoI/KpnI-cleaved pTrc99A. The resulting vector is designated as
"pTrcHSp1".

The KpnI/BamHI fragment coding for the FokI nuclease domain was
cut out from pTrc-Splase and ligated into pTrc-HSp1 cleaved by the same
enzyme set. The resulting plasmid is designated as pTrcH-Splase. The
sequence of the HSplase gene was confirmed by dideoxy-sequencing of
pTrcH-Splase dsDNA according to the method of Sanger, F., et al. (1977)
Proc. Natl. Acad. Sci. USA 74, 5463-5467.



Expression of Splase and HSplase Genes

Both the Splase and HSplase genes were expressed in pTrc99A and
pTO-N vectors. The plasmids used were pTrcSplase, pTOSplase,
pTrcH-Splase and pTOH-Splase.



WO 96/40882 PCT/US96/09315






Purification of the HSplase Enzyme from Cytosol

E. coli strain JM105 cells transformed as described above
with pTrc-HSplase were grown in 2.5 liters of medium which contained:
12 g yeast extract, 19 g tryptone, 10 mM MgCl2 per liter, and 100 µg/ml
of ampicillin. The cells were grown at 37°C to an OD600 of 0.6 units and
cooled to 25°C. The cells were induced with 0.4 mM isopropyl
β-D-thiogalactoside. The cells were allowed to grow overnight and
harvested by centrifugation and then resuspended at 3 ml/g wet weight
in 1X binding buffer which contained 6 mM imidazole; 0.5 M NaCl; 20 mM
Tris-HCl at pH 7.9, at 4°C. Next, 18 grams of the cells were disrupted
on ice using a Branson sonicator. Sonication lasted for 45 seconds and
was repeated 3 times. The sonicated cells were centrifuged at 15,000
x g for 25 minutes at 4°C. The supernatant was transferred to a new
tube and the pH was adjusted to 7.9. The HSplase enzyme was purified
from the supernatant by metal-chelating chromatography using His-bind
resin from Novagen. In brief, the sample was filtered through a 0.45
µm filter and then loaded onto a 5 ml His-bind column equilibrated with
1X binding buffer. The column was washed with 10 vol of binding buffer
and 10 vol of wash buffer which contained 60 mM imidazole, 0.5 M NaCl,
20 mM Tris-HCl at pH 7.9. The column was then eluted with 20 ml of
elution buffer containing 500 mM imidazole, 0.5 M NaCl, 20 mM Tris-HCl
at pH 7.9. The fractions that contained the HSplase enzyme were pooled
and desalted on a Sephadex G-25 column equilibrated with enzyme buffer
which contained 20 mM Tris-phosphate, pH 7.7, 50 mM NaCl. DTT was
added to the final sample to bring the sample to a final concentration
of 5 mM and the enzyme sample was frozen at -70°C. The yield was about
5-10 mg.

The HSplase enzyme has a molecular weight of 35 kDa and
accounted for about 10-20% of total cellular enzyme. This molecular
weight is in agreement with the calculated molecular weight of 35 kDa
for HSplase. The HSplase enzyme was also purified to greater than 30%
purity using a His-bind column under denaturing conditions.

Control cells which contained only the pTrc99A vector were also
grown and analyzed as above. The control cells did not make the
HSplase enzyme.

The HSplase enzyme was then evaluated as discussed hereinafter.

Purification and Refolding of HSplase from Inclusion Body

JM105 cells transfected with plasmid pTrc-HSplase were grown in
4 liters of 2 X TY containing 100 µg of ampicillin per ml at 21°C to an
optical density at 600 nm of 0.7 units. The cells were then induced for 8
hours with 1 mM isopropyl β-D-thiogalactoside. The cells were harvested
by centrifugation and then resuspended at 2.8 ml/g wet weight in
binding buffer which contains 6 mM imidazole, 0.5 M NaCl, 20 mM
Tris-HCl, at pH 7.9 and 8 M urea. Cells were disrupted on a Branson
sonicator for 5 minutes on ice. The sonicated cells were centrifuged
at 15,000 x g for 25 minutes at 4°C. The inclusion body in the pellet
was washed once with 50 ml of 1X binding buffer and centrifuged again.
The inclusion body pellet was solubilized in 100 ml of 1X binding
buffer, which contained: 6 mM imidazole; 0.5 M NaCl; 20 mM Tris-HCl
at pH 7.9; and 6 M urea, for 1 hour. The sample was then centrifuged as
above and the supernatant was mixed with 10 ml of His-bind resin for
batch binding. Binding was performed in a 250 ml round bottom flask and
the sample was mixed by rotating for 1 hour. The slurry was then
poured into a column. The column was washed with 10 volumes of binding
buffer and 10 volumes of wash buffer which contained: 20 mM imidazole;
0.5 M NaCl; 20 mM Tris-HCl at pH 7.9, and 8 M urea. The column was
then eluted with 60 ml of elution buffer. The elution buffer contained
200 mM imidazole; 0.5 M NaCl; 20 mM Tris-HCl, at pH 7.9 and 8 M urea.
Enzyme concentration of the eluant was around 1.5 mg/ml based on O.D.
at 280 nm.

Next, the HSplase enzyme isolated from the inclusion body was
refolded as follows: the eluant was first adjusted to a final
concentration of 4 mM reduced glutathione, 6 M urea and 0.5 mg/ml
enzyme. The sample was then dialyzed step-wise against refolding
buffers containing 50 mM Tris-phosphate, pH 7.9, 100 mM NaCl, 1 mM
reduced glutathione with urea concentrations of 4 M, 2 M, and 1 M. The
refolding buffer was changed every 8-12 hours. Finally, the sample was
dialyzed against the refolding buffer with 20% glycerol. After
refolding, the sample was passed through a 0.41 µm filter and loaded
onto a 4 ml His-bind column equilibrated with the refolding buffer. The
column was then washed with the refolding buffer further containing 60
mM imidazole, then eluted with the refolding buffer further containing
500 mM imidazole. The fractions containing enzyme were pooled and
desalted on a 70 ml Sephadex G-25 column equilibrated with refolding
buffer containing 20% glycerol. The enzyme sample (ca. 1.5 mg/ml) was
stored at -70°C. This HSplase, which had been isolated from the
inclusion body and refolded, was then evaluated; the Sp1 binding domain
was shown to bind specifically to Sp1 sequences. However, the cleavage
domain of this refolded HSplase appeared to bind nonspecifically and to
degrade DNA nonspecifically. That is, DNA without an Sp1 site was
degraded nonspecifically. The enzyme was shown to bind to the Sp1 site
specifically. However, little specific DNA cleavage was observed.
Accordingly, the HSplase which was isolated from the inclusion body and
refolded is less preferred as a recombinant tool.
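
The adjustment of the eluant before dialysis is a mixing calculation. The sketch below assumes the eluant starts at about 1.5 mg/ml enzyme in elution buffer read as 8 M urea with no glutathione; the composition of the diluent is not stated in the text and is computed here only as an illustration, using hypothetical helper names.

```python
def dilution_factor(start: float, target: float) -> float:
    """Fold dilution needed to bring a concentration from start down to target."""
    if target <= 0 or start < target:
        raise ValueError("target must be positive and no larger than start")
    return start / target

def diluent_concentration(start: float, target: float, factor: float) -> float:
    """Concentration the diluent needs so that mixing 1 part sample with
    (factor - 1) parts diluent gives the target concentration."""
    return (target * factor - start) / (factor - 1)

factor = dilution_factor(1.5, 0.5)                            # enzyme, mg/ml
urea_in_diluent = diluent_concentration(8.0, 6.0, factor)     # urea, M
gsh_in_diluent = diluent_concentration(0.0, 4.0, factor)      # reduced glutathione, mM

# Three urea steps (4 M, 2 M, 1 M), each lasting 8 to 12 hours
dialysis_hours = (3 * 8, 3 * 12)

print(f"Dilution: {factor:.0f}-fold")
print(f"Diluent: {urea_in_diluent:.1f} M urea, {gsh_in_diluent:.1f} mM glutathione")
print(f"Urea step-down dialysis: {dialysis_hours[0]} to {dialysis_hours[1]} hours")
```

Under these assumptions a threefold dilution with a diluent of about 5 M urea and 6 mM reduced glutathione gives the stated starting conditions for refolding.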
Preparation of Substrates for HSplase

To evaluate the HSplase as a restriction endonuclease,
substrates containing Sp1 sites were constructed.

Preparation of Plasmid pUCSp1 Having a Single Sp1 Site
A synthetic 26 base pair double-stranded DNA fragment having
the following base sequence was prepared:

5'-aat tcg cc g ggg cgg ggc ttc tgc ag-3' Seq. ID No. 16

3'-gc ggc ccc gcc ccg aag acg tcc tag-5'

The synthetic DNA fragment, Seq. ID No. 16, contained the Sp1
site with its flanking sequence from the human metallothionein IIa gene.
The Sp1 site is 5'-ggggcggggc-3'. The DNA fragment Seq. ID No. 16, which
has EcoRI/BamHI "sticky ends", was then ligated into the polylinker site
of an EcoRI/BamHI-cleaved pUC19 plasmid to produce the plasmid
designated "pUCSp1". The plasmid pUCSp1 has a single Sp1 site. Since
the synthetic DNA fragment Seq. ID No. 16 lacked 5'-phosphate groups,
only one such fragment could be ligated into each pUC19 plasmid.
Several clones were picked and a positive clone was identified by
analyzing its plasmid DNAs for the absence of the KpnI site present
between the EcoRI and BamHI sites in the polylinker region of pUC19. The
sequence of pUCSp1 was confirmed by dideoxy-sequencing.
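
The design of the synthetic fragment can be verified by checking that the two strands pair correctly, that the overhangs match EcoRI and BamHI ends, and that the Sp1 site is present. The sketch assumes the scanned strands read as shown in the strings below (a, c, g and t only); the variable names are illustrative.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")

top = "aattcgccggggcggggcttctgcag"           # upper strand, written 5'->3'
bottom_3to5 = "gcggccccgccccgaagacgtcctag"   # lower strand, written 3'->5'
SP1_SITE = "ggggcggggc"                      # Sequence ID No. 6

# Single-stranded overhangs, each read in the 5'->3' direction
top_overhang = top[:4]                       # expected to match an EcoRI end
bottom_overhang = bottom_3to5[-4:][::-1]     # expected to match a BamHI end

# The remaining 22 positions should be base paired
duplex_top = top[4:]
duplex_bottom = bottom_3to5[:-4]
strands_pair = duplex_top.translate(COMPLEMENT) == duplex_bottom

site_start = top.find(SP1_SITE)              # 0-based position on the upper strand

print("Overhangs:", top_overhang, bottom_overhang)
print("Duplex region pairs correctly:", strands_pair)
print("Sp1 site starts at index:", site_start)
```

With this reading both strands are 26 bases long, the paired region is 22 base pairs, and the overhangs are the 5'-aatt and 5'-gatc ends left by EcoRI and BamHI.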

Preparation of Plasmids Having Multiple Sp1 Sites: pUC2Sp1, pUC3Sp1,
pUC4Sp1, pUC5Sp1 and pUC6Sp1

Plasmids having multiple Sp1 sites were constructed by
inserting from one to five copies of a 24-mer synthetic DNA fragment,
Seq. ID No. 17, into pUCSp1 at the XbaI site.

The 24-mer synthetic DNA fragment Seq. ID No. 17, which
contains a single Sp1 site and XbaI sticky ends, has the following
sequence:

5'-cta ggc cgg ggc ggg ac t tct gca-3' Seq. ID No. 17
3'-cg gcc ccg ccc cga aga cgt gat c-5'

The synthetic DNA Seq. ID No. 17 was then mixed with XbaI
cleaved pUCSp1 DNA and ligated into pUCSp1 at the XbaI site.
The ligation mixture was transformed into JM105 cells. The
number of copies of Sp1 site DNA inserted was determined by
electrophoresis of EcoRI/SphI digested plasmid DNAs on a 4% MetaPhor
agarose gel. The resulting plasmids contain two to six copies of the Sp1
site and are termed pUC2Sp1, pUC3Sp1, pUC4Sp1, pUC5Sp1 and pUC6Sp1,
respectively.

Evaluation of HSplase Enzyme
Specific Binding of HSplase to Sp1 Sequence

The ability of HSplase enzyme to bind specifically to the Sp1 DNA
sequence was demonstrated by band-shift assay. 0.5 of the
HSplase enzyme that had been isolated from the inclusion body and
refolded was mixed with 0.4 µg of the 26 base pair synthetic DNA
substrate Seq. ID No. 16 in a total volume of 10 µl. The DNA substrate
Seq. ID No. 16 contained a single Sp1 site. In a separate control sample,
the HSplase was mixed with 0.4 µg DNA substrate Seq. ID No. 16 and 2 µg
competitor DNA which lacked an Sp1 site. The samples were incubated in
a solution containing 10 mM Tris-HCl, pH 8.0, 50 mM NaCl, 2 mM DTT, 0.1
mM ZnCl2, at 25°C for 20 minutes. The samples were applied to the 4%
agarose gel. The mixture of the DNA substrate Seq. ID No. 16 and
HSplase enzyme was applied to lane 1. The DNA substrate Seq. ID No. 16
was applied to lane 2. The DNA substrate Seq. ID No. 16 incubated with
HSplase in the presence of the competitor DNA was applied to lane 3.
The DNA substrate Seq. ID No. 16 and competitor DNA were applied to lane
4. The competitor DNA incubated with HSplase was applied to lane 5.
The competitor DNA was applied to lane 6. The gel was subjected to
electrophoresis in TAE buffer without EDTA. The gel is shown in Figure
1.

As can be seen by the shifted band in lane 1 of the gel shown
in Figure 1, Splase enzyme readily formed a protein-DNA complex band
when mixed with the substrate DNA Seq. ID No. 16, which contains one Sp1
site. The shifted band is also present in lane 3, which contained the
HSplase incubated with DNA substrate Seq. ID No. 16 and the competitor
DNA which lacks an Sp1 site. However, as seen in lane 5, no HSplase-DNA
complex was formed when the competitor DNA alone was incubated with
HSplase, which demonstrates that the binding of Splase to the DNA is
specific.

HSplase Cleavage of Circular DNA

To test the ability of HSplase to cleave circular DNA, two
plasmids, pUC19, which lacks an Sp1 site, and pUC5Sp1, which has 5 Sp1
sites, were subjected to HSplase digestion.

The reaction mixtures, which had a total volume of 20 µl,
contained the following: plasmid DNA (0.5 µg for pUC19 and pUCSp1, and
1 µg for pUC3Sp1 and pUC5Sp1); Tris-HCl; 2 mM MgCl2; 5 mM DTT; 0.1 mM
ZnCl2; and 10 ng HSplase. The reaction mixture was incubated at 37°C for
2 hours and then applied to a 0.7% agarose gel. The pUC19 DNA-HSplase
mixture was applied to lane 1. pUC19 DNA without HSplase was applied to
lane 2. DNA molecular weight marker VII from Boehringer Mannheim was
applied to lane 3. The pUCSp1-HSplase mixture was applied to lane 4.
The pUCSp1 DNA was applied to lane 5. The pUC3Sp1 incubated
with HSplase was applied to lane 6, pUCSSpl was applied to lane 7, and
pUC5Sp1 incubated with HSplase was applied to lane 8. The gel was then
subjected to electrophoresis, and is shown in Figure 2A.

As can be seen in Figure 2A, the Hsplase cleaved the pUCSSpl 
5 DNA near the Spl sites and converted the circular DKA to a linear form 
which migrated as a 2.7Kb fragment, as shown in lane 8. Little 
nonspecific nuclease activity is shown in Figure 2A, lane 8. 

To confirm that Hsplase cleaved the plasmid DNA specifically
near Sp1 sites, the linearized DNA, that is, the DNA bands migrating at
the 2.7 kb position in Figure 2A, were excised from lanes 1 and 8 of
the agarose gel and purified using the Geneclean kit from BIO 101, USA.
This linearized DNA was then digested by a second restriction enzyme,
AlwNI. Specifically, the excised DNAs were incubated with 10 units of
AlwNI in the buffer supplied by the manufacturer at 37°C for 1 hour and
applied to a 0.8% agarose gel. The pUC19 substrate after double
digestion was applied to lane 1. The pUC5Sp1 substrate after double
digestion was applied to lane 2. DNA molecular weight marker VII from
Boehringer Mannheim was applied to lane 3. The gel was then subjected
to electrophoresis and is shown in Figure 2B.

The linearized plasmid from lane 8 of the gel shown in Figure
2A was cleaved into two fragments, a 1.9 kb fragment and a 0.8 kb
fragment. These fragments are shown in Figure 2B, lane 2. This
confirms that the cut was near the Sp1 sites.
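
The fragment arithmetic behind this double digest can be sketched in a
few lines of Python. The 2.7 kb plasmid size and the 1.9 kb and 0.8 kb
fragment sizes come from the text; the cut coordinates below are
hypothetical placeholders chosen only so that the two sites lie 0.8 kb
apart, and the function name is illustrative rather than part of the
disclosure.

```python
def digest_fragments(length, cuts, circular=True):
    """Return fragment sizes (bp), largest first, after cutting a DNA molecule.

    length   -- total size of the molecule in base pairs
    cuts     -- cut coordinates, each strictly inside the molecule
    circular -- True for a plasmid, False for a linearized molecule
    """
    positions = sorted(set(cuts))
    if not positions:
        return [length]
    # Distances between consecutive cut sites.
    sizes = [b - a for a, b in zip(positions, positions[1:])]
    if circular:
        # On a circle the last and first sites are joined across the origin.
        sizes.append(length - positions[-1] + positions[0])
    else:
        # On a linear molecule each end contributes its own fragment.
        sizes.insert(0, positions[0])
        sizes.append(length - positions[-1])
    return sorted(sizes, reverse=True)


PLASMID_BP = 2700   # approximate plasmid size quoted in the text
SP1_CUT = 400       # hypothetical coordinate of the Hsplase cut
ALWNI_CUT = 1200    # hypothetical coordinate of the AlwNI cut

# A single cut only linearizes the circle; a second cut gives two fragments.
print(digest_fragments(PLASMID_BP, [SP1_CUT]))
print(digest_fragments(PLASMID_BP, [SP1_CUT, ALWNI_CUT]))
```

Under these assumed coordinates, the single digest returns one 2700 bp
molecule and the double digest returns 1900 bp and 800 bp fragments,
which is the pattern described for lane 2 of Figure 2B.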

In contrast, incubation of the control DNA pUC19 with the
Hsplase enzyme yielded a minor quantity of DNA migrating with a size of
2.7 kb, as shown in Figure 2A, lane 1. Subsequent digestion of this band
by AlwNI did not yield any new fragment, as shown in Figure 2B, lane 1.
These results indicated that the Hsplase enzyme was specific for the Sp1
site and had little nonspecific nuclease activity.

HSplase Cleavage of Linear DNA

The plasmid pUC-BENN-CAT, which contains the LTR sequence of
HIV, along with pUC19 and pUC5Sp1, were used as substrates for Hsplase.
The LTR sequence contains the following Sp1 sites:

5'-GAAGCGTGGC-3' Sequence ID No. 7

5'-TGGGCGGGAC-3' Sequence ID No. 8

5'-GGGGAGTGGC-3' Sequence ID No. 9
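
As a quick consistency check, the three LTR sites can be compared with
the degenerate recognition sequence given in Sequence ID No. 10
(KRRGMGKRRY). The sketch below uses the standard IUPAC nucleotide codes
(K = G or T, R = A or G, M = A or C, Y = C or T); the function and
variable names are illustrative and not part of the disclosure.

```python
# Standard IUPAC nucleotide codes needed for the degenerate sequence.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "R": "AG", "M": "AC", "Y": "CT",
}

CONSENSUS = "KRRGMGKRRY"  # Sequence ID No. 10

LTR_SITES = {
    "Sequence ID No. 7": "GAAGCGTGGC",
    "Sequence ID No. 8": "TGGGCGGGAC",
    "Sequence ID No. 9": "GGGGAGTGGC",
}


def matches(site, consensus=CONSENSUS):
    """Return True if every base of `site` is allowed by the consensus."""
    if len(site) != len(consensus):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, consensus))


for name, site in LTR_SITES.items():
    print(name, site, matches(site))
```

Each of the three listed sites satisfies the degenerate sequence
position by position, as does the site of Sequence ID No. 6
(GGGGCGGGGC).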

The plasmid DNAs of pUC19 and pUC5Sp1 were first linearized by
the restriction enzyme AlwNI, while pUC-BENN-CAT was linearized by
digestion with BamHI. These linearized plasmids were then subjected to
restriction digestion with Hsplase. The reaction mixtures had a total
volume of 10 µl and contained 10 mM Tris-HCl, 0.75 ng of plasmid
DNA, 2 mM MgCl2, 5 mM DTT, 0.1 mM ZnCl2, 100 µg/ml BSA and 10 ng of
Hsplase. The reactions were incubated at 37°C for 2 or 3 hours and
applied to a 0.7% agarose gel. Molecular weight markers were applied to
lanes 1 and 8; pUC19 linearized by AlwNI was applied to lane 2;
AlwNI-linearized pUC19 incubated with Hsplase was applied to lane 3;
pUC5Sp1 linearized by AlwNI was applied to lane 4; AlwNI-linearized
pUC5Sp1 incubated with Hsplase was applied to lane 5; pUC-BENN-CAT
linearized by BamHI was applied to lane 6; and the BamHI-linearized
pUC-BENN-CAT incubated with Hsplase was applied to lane 7. The gel was
then subjected to electrophoresis.

As shown in Figure 3, the linearized pUC19 control DNA is not
cut by Hsplase (lanes 2-3), while pUC5Sp1 is cut specifically into 1.9
kb and 0.8 kb fragments, as shown in lanes 4-5. Most importantly, the
linearized pUC-BENN-CAT, which carries the HIV LTR sequence with three
consecutive Sp1 sites, was also cut specifically by Splase. As shown in
lanes 6 and 7 of Figure 3, cleavage of the BamHI-linearized
pUC-BENN-CAT by Splase near the Sp1 sites generated two fragments of
4.20 kb and 1.74 kb.

In addition to the methods used herein to produce the Splase
enzyme, the enzyme may be made using conventional techniques such as
peptide synthesizers for assembling amino acids.

Conclusion 

The fusion gene encoding the Splase enzyme, a hybrid
endonuclease, has been constructed and expressed in E. coli. The Splase
enzyme, purified from E. coli, binds specifically to the Sp1 DNA site
and digests plasmid DNAs carrying Sp1 sites. The Hsplase enzyme is also
a relatively specific rare-cutter restriction endonuclease. Splase and
Hsplase enzymes are both efficient, specific and useful for practical
application in biotechnology techniques.

The present invention includes: the DNA sequences encoding a
restriction endonuclease comprising the cleavage domain of FokI and the
binding domain of Sp1, the messenger RNA transcript of such DNA
sequence, and the restriction endonuclease which recognizes Sp1 sites.

For example, the DNA sequences include: DNA molecules which,
but for the degeneracy of the genetic code, would hybridize to DNA
encoding the artificial restriction nuclease, thus the degenerate DNA
which encodes the artificial restriction nuclease; DNA strands
complementary to DNA sequences encoding the artificial restriction
nuclease or portions thereof, including DNA in SEQ ID NO 1 and 3 or
portions thereof; heterologous DNA having substantial sequence homology
to the DNA encoding the artificial restriction nuclease, including the
DNA sequences in SEQ ID NO 2 and 4 or portions thereof.
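
The degeneracy of the genetic code referred to here means that
different DNA sequences can encode the same protein. As a minimal
sketch, the first 48 nucleotides of SEQ ID NO: 3 are translated below
with the standard genetic code and compared with a variant in which
codons are swapped for synonymous ones. The variant sequence is a
hypothetical example constructed for illustration only; it is not taken
from the disclosure, and the helper names are likewise illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 16 codons of SEQ ID NO: 3, including the six histidine codons.
SEQ3_START = "ATG GCT CAT CAC CAT CAC CAT CAC AAA AAG AAA CAG CAT ATT TGC CAC"

# Hypothetical variant with synonymous codon substitutions (illustration only).
VARIANT = "ATG GCC CAC CAT CAC CAT CAC CAT AAG AAA AAG CAA CAC ATC TGT CAT"

print(translate(SEQ3_START))                        # MAHHHHHHKKKQHICH
print(translate(VARIANT) == translate(SEQ3_START))  # True
```

The two nucleotide strings differ at most codons yet give the same
peptide, which is the sense in which a degenerate DNA still encodes the
artificial restriction nuclease.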

The artificial restriction nuclease includes, for example,
artificial restriction endonucleases comprising portions of the cleavage
domain of FokI and the binding domain of Sp1, and proteins having
substantially the same amino acid sequence as shown in SEQ ID NO 2 and
4 or portions thereof.

While the invention has been described with a certain degree of
particularity, various adaptations and modifications can be made
without departing from the scope of the invention as defined in the
appended claims.






SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Tsai, Ming-Daw 

Huang, Baohua

(ii) TITLE OF INVENTION: Artificial Restriction Endonuclease 



(iii) NUMBER OF SEQUENCES: 17 
(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Calfee, Halter and Griswold

(B) STREET: Suite 1800, 800 Superior Avenue 

(C) CITY: Cleveland 

(D) STATE: Ohio 

(E) COUNTRY: U.S.A. 

(F) ZIP: 44114-2688 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Golrick, Mary E.

(B) REGISTRATION NUMBER: 34,829 

(C) REFERENCE/DOCKET NUMBER: 18525/00110 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (216) 622-8458 

(B) TELEFAX: (216) 241-0816 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 894 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: YES
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..909

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..894

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG GCT AAA AAG AAA CAG CAT ATT TGC CAC ATC CAA GGC TGT GGG AAA   48
Met Ala Lys Lys Lys Gln His Ile Cys His Ile Gln Gly Cys Gly Lys
  1               5                  10                  15

GTG TAT GGC AAG ACC TCT CAC CTG CGG GCA CAC TTG CGC TGG CAT ACA   96
Val Tyr Gly Lys Thr Ser His Leu Arg Ala His Leu Arg Trp His Thr
             20                  25                  30

GGC GAG AGG CCA TTT ATG TGT ACC TGG TCA TAC TGT GGG AAA CGC TTC  144
Gly Glu Arg Pro Phe Met Cys Thr Trp Ser Tyr Cys Gly Lys Arg Phe
         35                  40                  45

ACA CGT TCG GAT GAG CTA CAG AGG CAC AAA CGT ACA CAC ACA GGT GAG  192
Thr Arg Ser Asp Glu Leu Gln Arg His Lys Arg Thr His Thr Gly Glu
     50                  55                  60

AAG AAA TTT GCC TGC CCT GAG TGT CCT AAG CGC TTC ATG AGG AGT GAC  240
Lys Lys Phe Ala Cys Pro Glu Cys Pro Lys Arg Phe Met Arg Ser Asp
 65                  70                  75                  80

CAC CTG TCA AAA CAT ATC AAG ACC CAC CAG AAT AAG AAG GTA CCT AAT  288
His Leu Ser Lys His Ile Lys Thr His Gln Asn Lys Lys Val Pro Asn
                 85                  90                  95

CGT GGT GTG ACT AAG CAA CTA GTC AAA AGT GAA CTG GAG GAG AAG AAA  336
Arg Gly Val Thr Lys Gln Leu Val Lys Ser Glu Leu Glu Glu Lys Lys
            100                 105                 110

TCT GAA CTT CGT CAT AAA TTG AAA TAT GTG CCT CAT GAA TAT ATT GAA  384
Ser Glu Leu Arg His Lys Leu Lys Tyr Val Pro His Glu Tyr Ile Glu
        115                 120                 125

TTA ATT GAA ATT GCC AGA AAT TCC ACT CAG GAT AGA ATT CTT GAA ATG  432
Leu Ile Glu Ile Ala Arg Asn Ser Thr Gln Asp Arg Ile Leu Glu Met
    130                 135                 140

AAG GTA ATG GAA TTT TTT ATG AAA GTT TAT GGA TAT AGA GGT AAA CAT  480
Lys Val Met Glu Phe Phe Met Lys Val Tyr Gly Tyr Arg Gly Lys His
145                 150                 155                 160

TTG GGT GGA TCA AGG AAA CCG GAC GGA GCA ATT TAT ACT GTC GGA TCT  528
Leu Gly Gly Ser Arg Lys Pro Asp Gly Ala Ile Tyr Thr Val Gly Ser
                165                 170                 175

CCT ATT GAT TAC GGT GTG ATC GTG GAT ACT AAA GCT TAT AGC GGA GGT  576
Pro Ile Asp Tyr Gly Val Ile Val Asp Thr Lys Ala Tyr Ser Gly Gly
            180                 185                 190

TAT AAT CTG CCA ATT GGC CAA GCA GAT GAA ATG CAA CGA TAT GTC GAA  624
Tyr Asn Leu Pro Ile Gly Gln Ala Asp Glu Met Gln Arg Tyr Val Glu
        195                 200                 205

GAA AAT CAA ACA CGA AAC AAA CAT ATC AAC CCT AAT GAA TGG TGG AAA  672
Glu Asn Gln Thr Arg Asn Lys His Ile Asn Pro Asn Glu Trp Trp Lys
    210                 215                 220

GTC TAT CCA TCT TCT GTA ACG GAA TTT AAG TTT TTA TTT GTG AGT GGT  720
Val Tyr Pro Ser Ser Val Thr Glu Phe Lys Phe Leu Phe Val Ser Gly
225                 230                 235                 240

CAC TTT AAA GGA AAC TAC AAA GCT CAG CTT ACA CGA TTA AAT CAT ATC  768
His Phe Lys Gly Asn Tyr Lys Ala Gln Leu Thr Arg Leu Asn His Ile
                245                 250                 255

ACT AAT TGT AAT GGA GCT GTT CTT AGT GTA GAA GAG CTT TTA ATT GGT  816
Thr Asn Cys Asn Gly Ala Val Leu Ser Val Glu Glu Leu Leu Ile Gly
            260                 265                 270

GGA GAA ATG ATT AAA GCC GGC ACA TTA ACC TTA GAG GAA GTG AGA CGG  864
Gly Glu Met Ile Lys Ala Gly Thr Leu Thr Leu Glu Glu Val Arg Arg
        275                 280                 285

AAA TTT AAT AAC GGC GAG ATA AAC TTT TAA                          894
Lys Phe Asn Asn Gly Glu Ile Asn Phe
    290                 295

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 297 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ala Lys Lys Lys Gln His Ile Cys His Ile Gln Gly Cys Gly Lys
  1               5                  10                  15

Val Tyr Gly Lys Thr Ser His Leu Arg Ala His Leu Arg Trp His Thr
             20                  25                  30

Gly Glu Arg Pro Phe Met Cys Thr Trp Ser Tyr Cys Gly Lys Arg Phe
         35                  40                  45

Thr Arg Ser Asp Glu Leu Gln Arg His Lys Arg Thr His Thr Gly Glu
     50                  55                  60

Lys Lys Phe Ala Cys Pro Glu Cys Pro Lys Arg Phe Met Arg Ser Asp
 65                  70                  75                  80

His Leu Ser Lys His Ile Lys Thr His Gln Asn Lys Lys Val Pro Asn
                 85                  90                  95

Arg Gly Val Thr Lys Gln Leu Val Lys Ser Glu Leu Glu Glu Lys Lys
            100                 105                 110

Ser Glu Leu Arg His Lys Leu Lys Tyr Val Pro His Glu Tyr Ile Glu
        115                 120                 125

Leu Ile Glu Ile Ala Arg Asn Ser Thr Gln Asp Arg Ile Leu Glu Met
    130                 135                 140

Lys Val Met Glu Phe Phe Met Lys Val Tyr Gly Tyr Arg Gly Lys His
145                 150                 155                 160

Leu Gly Gly Ser Arg Lys Pro Asp Gly Ala Ile Tyr Thr Val Gly Ser
                165                 170                 175

Pro Ile Asp Tyr Gly Val Ile Val Asp Thr Lys Ala Tyr Ser Gly Gly
            180                 185                 190

Tyr Asn Leu Pro Ile Gly Gln Ala Asp Glu Met Gln Arg Tyr Val Glu
        195                 200                 205

Glu Asn Gln Thr Arg Asn Lys His Ile Asn Pro Asn Glu Trp Trp Lys
    210                 215                 220

Val Tyr Pro Ser Ser Val Thr Glu Phe Lys Phe Leu Phe Val Ser Gly
225                 230                 235                 240

His Phe Lys Gly Asn Tyr Lys Ala Gln Leu Thr Arg Leu Asn His Ile
                245                 250                 255

Thr Asn Cys Asn Gly Ala Val Leu Ser Val Glu Glu Leu Leu Ile Gly
            260                 265                 270

Gly Glu Met Ile Lys Ala Gly Thr Leu Thr Leu Glu Glu Val Arg Arg
        275                 280                 285

Lys Phe Asn Asn Gly Glu Ile Asn Phe
    290                 295

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 912 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: YES
(iv) ANTI-SENSE: NO

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..909 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATG GCT CAT CAC CAT CAC CAT CAC AAA AAG AAA CAG CAT ATT TGC CAC   48
Met Ala His His His His His His Lys Lys Lys Gln His Ile Cys His
  1               5                  10                  15

ATC CAA GGC TGT GGG AAA GTG TAT GGC AAG ACC TCT CAC CTG CGG GCA   96
Ile Gln Gly Cys Gly Lys Val Tyr Gly Lys Thr Ser His Leu Arg Ala
             20                  25                  30

CAC TTG CGC TGG CAT ACA GGC GAG AGG CCA TTT ATG TGT ACC TGG TCA  144
His Leu Arg Trp His Thr Gly Glu Arg Pro Phe Met Cys Thr Trp Ser
         35                  40                  45

TAC TGT GGG AAA CGC TTC ACA CGT TCG GAT GAG CTA CAG AGG CAC AAA  192
Tyr Cys Gly Lys Arg Phe Thr Arg Ser Asp Glu Leu Gln Arg His Lys
     50                  55                  60

CGT ACA CAC ACA GGT GAG AAG AAA TTT GCC TGC CCT GAG TGT CCT AAG  240
Arg Thr His Thr Gly Glu Lys Lys Phe Ala Cys Pro Glu Cys Pro Lys
 65                  70                  75                  80

CGC TTC ATG AGG AGT GAC CAC CTG TCA AAA CAT ATC AAG ACC CAC CAG  288
Arg Phe Met Arg Ser Asp His Leu Ser Lys His Ile Lys Thr His Gln
                 85                  90                  95

AAT AAG AAG GTA CCT AAT CGT GGT GTG ACT AAG CAA CTA GTC AAA AGT  336
Asn Lys Lys Val Pro Asn Arg Gly Val Thr Lys Gln Leu Val Lys Ser
            100                 105                 110

GAA CTG GAG GAG AAG AAA TCT GAA CTT CGT CAT AAA TTG AAA TAT GTG  384
Glu Leu Glu Glu Lys Lys Ser Glu Leu Arg His Lys Leu Lys Tyr Val
        115                 120                 125

CCT CAT GAA TAT ATT GAA TTA ATT GAA ATT GCC AGA AAT TCC ACT CAG  432
Pro His Glu Tyr Ile Glu Leu Ile Glu Ile Ala Arg Asn Ser Thr Gln
    130                 135                 140

GAT AGA ATT CTT GAA ATG AAG GTA ATG GAA TTT TTT ATG AAA GTT TAT  480
Asp Arg Ile Leu Glu Met Lys Val Met Glu Phe Phe Met Lys Val Tyr
145                 150                 155                 160

GGA TAT AGA GGT AAA CAT TTG GGT GGA TCA AGG AAA CCG GAC GGA GCA  528
Gly Tyr Arg Gly Lys His Leu Gly Gly Ser Arg Lys Pro Asp Gly Ala
                165                 170                 175

ATT TAT ACT GTC GGA TCT CCT ATT GAT TAC GGT GTG ATC GTG GAT ACT  576
Ile Tyr Thr Val Gly Ser Pro Ile Asp Tyr Gly Val Ile Val Asp Thr
            180                 185                 190

AAA GCT TAT AGC GGA GGT TAT AAT CTG CCA ATT GGC CAA GCA GAT GAA  624
Lys Ala Tyr Ser Gly Gly Tyr Asn Leu Pro Ile Gly Gln Ala Asp Glu
        195                 200                 205

ATG CAA CGA TAT GTC GAA GAA AAT CAA ACA CGA AAC AAA CAT ATC AAC  672
Met Gln Arg Tyr Val Glu Glu Asn Gln Thr Arg Asn Lys His Ile Asn
    210                 215                 220

CCT AAT GAA TGG TGG AAA GTC TAT CCA TCT TCT GTA ACG GAA TTT AAG  720
Pro Asn Glu Trp Trp Lys Val Tyr Pro Ser Ser Val Thr Glu Phe Lys
225                 230                 235                 240

TTT TTA TTT GTG AGT GGT CAC TTT AAA GGA AAC TAC AAA GCT CAG CTT  768
Phe Leu Phe Val Ser Gly His Phe Lys Gly Asn Tyr Lys Ala Gln Leu
                245                 250                 255

ACA CGA TTA AAT CAT ATC ACT AAT TGT AAT GGA GCT GTT CTT AGT GTA  816
Thr Arg Leu Asn His Ile Thr Asn Cys Asn Gly Ala Val Leu Ser Val
            260                 265                 270

GAA GAG CTT TTA ATT GGT GGA GAA ATG ATT AAA GCC GGC ACA TTA ACC  864
Glu Glu Leu Leu Ile Gly Gly Glu Met Ile Lys Ala Gly Thr Leu Thr
        275                 280                 285

TTA GAG GAA GTG AGA CGG AAA TTT AAT AAC GGC GAG ATA AAC TTT      909
Leu Glu Glu Val Arg Arg Lys Phe Asn Asn Gly Glu Ile Asn Phe
    290                 295                 300

TAA                                                              912

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 303 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Ala His His His His His His Lys Lys Lys Gln His Ile Cys His
  1               5                  10                  15

Ile Gln Gly Cys Gly Lys Val Tyr Gly Lys Thr Ser His Leu Arg Ala
             20                  25                  30

His Leu Arg Trp His Thr Gly Glu Arg Pro Phe Met Cys Thr Trp Ser
         35                  40                  45

Tyr Cys Gly Lys Arg Phe Thr Arg Ser Asp Glu Leu Gln Arg His Lys
     50                  55                  60

Arg Thr His Thr Gly Glu Lys Lys Phe Ala Cys Pro Glu Cys Pro Lys
 65                  70                  75                  80

Arg Phe Met Arg Ser Asp His Leu Ser Lys His Ile Lys Thr His Gln
                 85                  90                  95

Asn Lys Lys Val Pro Asn Arg Gly Val Thr Lys Gln Leu Val Lys Ser
            100                 105                 110

Glu Leu Glu Glu Lys Lys Ser Glu Leu Arg His Lys Leu Lys Tyr Val
        115                 120                 125

Pro His Glu Tyr Ile Glu Leu Ile Glu Ile Ala Arg Asn Ser Thr Gln
    130                 135                 140

Asp Arg Ile Leu Glu Met Lys Val Met Glu Phe Phe Met Lys Val Tyr
145                 150                 155                 160

Gly Tyr Arg Gly Lys His Leu Gly Gly Ser Arg Lys Pro Asp Gly Ala
                165                 170                 175

Ile Tyr Thr Val Gly Ser Pro Ile Asp Tyr Gly Val Ile Val Asp Thr
            180                 185                 190

Lys Ala Tyr Ser Gly Gly Tyr Asn Leu Pro Ile Gly Gln Ala Asp Glu
        195                 200                 205

Met Gln Arg Tyr Val Glu Glu Asn Gln Thr Arg Asn Lys His Ile Asn
    210                 215                 220

Pro Asn Glu Trp Trp Lys Val Tyr Pro Ser Ser Val Thr Glu Phe Lys
225                 230                 235                 240

Phe Leu Phe Val Ser Gly His Phe Lys Gly Asn Tyr Lys Ala Gln Leu
                245                 250                 255

Thr Arg Leu Asn His Ile Thr Asn Cys Asn Gly Ala Val Leu Ser Val
            260                 265                 270

Glu Glu Leu Leu Ile Gly Gly Glu Met Ile Lys Ala Gly Thr Leu Thr
        275                 280                 285

Leu Glu Glu Val Arg Arg Lys Phe Asn Asn Gly Glu Ile Asn Phe
    290                 295                 300

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: YES
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
KGGGCGGRRY

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 
(iv) ANTI- SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GGGGCGGGGC

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GAAGCGTGGC

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI- SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8 
TGGGCGGGAC 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI- SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GGGGAGTGGC

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI -SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10 
KRRGMGKRRY 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI -SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11 
GTCCATGGCT AAAAAGAAAC AGCATATTTG CCAC 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CCGGTACCTT CTTATTCTGG TGGGTC
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CGGGTACCTA ATCGTGGTGT GACTAAG 27
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GCGGATCCTT AAAAGTTTAT CTCGCCGTTA TT
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GTCCATGGCT CATCACCATC ACCATCACAA AAAGAAACAG CATATTTGCC AC 52

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
AATTCGCCGG GGCGGGGCTT CTGCAG
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17 
CTAGGCCGGG GCGGGGCTTC TGCA

WHAT IS CLAIMED:

1. An artificial restriction endonuclease comprising the cleavage
domain of FokI and the binding domain of Sp1.

2. The endonuclease of claim 1 further comprising from 6 to 10
histidines at the amino terminus.

3. The endonuclease of claim 1 wherein the endonuclease comprises the
amino acid sequence as shown in Sequence ID No. 2.

4. The endonuclease of claim 1 wherein the endonuclease comprises the
amino acid sequence as shown in Sequence ID No. 4.

5. An artificial gene encoding artificial restriction endonuclease 
of claim 1. 

6. The gene of claim 3 wherein the gene codes for the enzyme as shown 
in Sequence ID No. 2. 

7. The gene of claim 3 wherein the gene codes for the enzyme as shown
in Sequence ID No. 4. 

8. The gene of claim 4 wherein the gene codes for the endonuclease 
of claim 2. 

9. The artificial restriction endonuclease of claim 1, wherein the
endonuclease recognizes the following nucleotide sequence: 5'-G(T)G(A)G(A)
GC(A)G G(T)G(A)G(A)C(T)-3'.

10. An artificial restriction endonuclease consisting essentially of
the cleavage domain of FokI and the binding domain of Sp1.

11. A vector containing the gene of claim 5. 

12. A cell containing the gene of claim 5. 

13. A method for cleaving both circular and linear DNA samples having
at least one Sp1 site comprising the following steps:

(a) providing the artificial restriction endonuclease of claim 1;

(b) mixing said artificial restriction endonuclease with the DNA
sample;

(c) incubating the DNA sample with the artificial restriction
endonuclease for a time sufficient to provide one or more DNA fragments
where the DNA was circular, or to provide at least two DNA fragments
where the DNA was linear.